Data Mining Career Batting Performances in Baseball

Author

  • David D. Tung
Abstract

In this paper, we use statistical data mining techniques to analyze a multivariate data set of career batting performances in Major League Baseball. Principal components analysis (PCA) is used to transform the high-dimensional data into a small number of principal components that retain a high percentage of the sample variation, thereby reducing the dimensionality of the data. From PCA, we determine a few key factors underlying classical and sabermetric batting statistics. The most important of these is a new measure, which we call Offensive Player Grade (OPG), that efficiently summarizes a player’s offensive performance on a numerical scale. The lower-dimensional principal components allow for accessible visualization of the data and for segmentation of players into groups, which is done here using the K-means clustering algorithm. We provide illuminating visual displays from our statistical data mining procedures, and we also furnish a listing of the players with the top 100 OPG scores, which should be of interest to those who follow baseball.
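The pipeline described in the abstract (standardize the batting statistics, project them onto a leading principal component, then cluster the resulting scores with K-means) can be sketched as follows. This is only an illustration under stated assumptions: the six players and their AVG/OBP/SLG values are invented, the choice of three variables is arbitrary, and the first-component score computed here is a stand-in, not the paper's OPG, whose exact definition is not given in this abstract. All function names are hypothetical.

```python
import math

# Invented career lines (AVG, OBP, SLG) for six hypothetical players.
players = ["A", "B", "C", "D", "E", "F"]
stats = [
    [0.310, 0.400, 0.560],
    [0.300, 0.385, 0.520],
    [0.295, 0.380, 0.540],
    [0.250, 0.310, 0.380],
    [0.245, 0.300, 0.370],
    [0.260, 0.320, 0.400],
]

def standardize(rows):
    """Center each column at 0 and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def first_component(z, iters=500):
    """Leading eigenvector of the sample correlation matrix via power iteration."""
    n, p = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    # The trace of a correlation matrix is p, so eigenvalue / p is the
    # share of total variance carried by this component.
    return v, eigenvalue / p

def kmeans_1d(values, k=2, iters=100):
    """Lloyd's algorithm on scalar scores (k >= 2), with a deterministic start."""
    ordered = sorted(values)
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each score goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        # Update step: each center moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels

z = standardize(stats)
loadings, share = first_component(z)
scores = [sum(zi * wi for zi, wi in zip(row, loadings)) for row in z]
labels = kmeans_1d(scores, k=2)

for name, score, label in zip(players, scores, labels):
    print(f"{name}: score={score:+.2f} cluster={label}")
print(f"variance share of first component: {share:.2f}")
```

On this toy data the three statistics are strongly correlated, so a single component carries most of the variation and the two clusters separate the stronger hitters from the weaker ones. A real analysis would use many more variables and would typically retain several components before clustering.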

Related resources

Understanding Career Progression in Baseball Through Machine Learning

Professional baseball players are increasingly guaranteed expensive long-term contracts, with over 70 deals signed in excess of $90 million, mostly in the last decade. These are substantial sums compared to a typical franchise valuation of $1-2 billion. Hence, the players to whom a team chooses to give such a contract can have an enormous impact on both competitiveness and profit. Despite this,...

The Effects of Two Types of Training on the Physical Ability of University Baseball Players

Background. Training in any sport aims to maximize athletes’ physical capacity. Objectives. This study aimed to determine the effects of two training programs, functional training and weight training, on the physical capacity of university baseball players. Methods. The participants included 10 university baseball players, divided into the functional training group (FTG, n=5) and the weight ...

Transfer of Training from Virtual to Real Baseball Batting

The use of virtual environments (VE) for training perceptual-motor skills in sports continues to be a rapidly growing area. However, there is a dearth of research that has examined whether training in sports simulation transfers to the real task. In this study, the transfer of perceptual-motor skills trained in an adaptive baseball batting VE to real baseball performance was investigated. Eigh...

Age and Level of Performance in Major League Baseball

The relationship between age and the level of performance of major league baseball players was assessed through quasi-experimental designs. Whereas cross-sectional comparisons revealed no differences in batting and fielding statistics between younger and older players, longitudinal analysis showed significant decrements in batting performance as players aged from 30 to 35 years. A decline in pe...

Implicitly Defined Baseball Statistics

Major League Baseball uses statistics to determine awards every season. The batting title is awarded to the player with the highest batting average. The Cy Young Award is given to the top pitcher, who is determined by many different statistics, including earned run average (ERA). Batting average and ERA have been used for many years and are major statistics in baseball. Neither batting average...

Publication date: 2012